Data copying method and apparatus in a thin provisioned system

ABSTRACT

Data migration includes copying between normal volumes and thin provisioned volumes. Data in a normal volume can be copied to a thin provisioned volume. Alternatively, data structures can be provided to facilitate converting a normal volume into a thin provisioned volume without actual copying of data. Copying from a thin provisioned volume to a normal volume is also disclosed.

CROSS-REFERENCE TO RELATED APPLICATION

This application is related to commonly owned U.S. application Ser. No. 09/931,253, filed Aug. 17, 2001, now U.S. Pat. No. 6,725,328, which is herein incorporated by reference in its entirety for all purposes.

BACKGROUND OF THE INVENTION

The invention is related to storage systems and in particular to migration in an allocation-as-needed (i.e., thin provisioned) storage system.

Allocation-on-use (allocation-as-needed, also referred to as “thin provisioning”) technology provides efficient storage space management for virtual volumes, since space is allocated on an as-needed basis. Conventional “manual provisioning” of storage involves installing the actual physical storage called for; e.g., if 10 terabytes (TB) of storage is required, then in a “manual provisioning” approach, 10 TB of storage is purchased and installed. Manually provisioned volumes are referred to herein as “normal volumes”. Thin provisioning allows a user (e.g., administrator) to create volumes of any size without actually purchasing or installing the entire amount of disk storage. Thin provisioned volumes are referred to herein as “thin provisioned volumes.” A common use of thin provisioning is in virtual storage systems, where “virtual volumes” in the virtual storage are provided as thin provisioned volumes. Commonly owned U.S. Pat. No. 6,725,328 shows an example of thin provisioning, referred to therein as allocation-on-use.

Current data migration technologies for volumes such as Logical Units (LUs) in the SCSI environment perform operations on a block-by-block basis irrespective of the data in the blocks. If current migration technology is used with thin provisioning, the benefits of thin provisioning are lost because conventional migration technology copies all blocks in the source volume to the target volume. Consequently, even in a thin-provisioning system, all blocks would be allocated. Improvements in this area of storage technologies can be made.

As the amount of information handled by computer systems in companies, corporations, etc. has drastically increased, the capacity of storage devices such as disks has steadily increased in recent years. For example, a magnetic disk storage system having a capacity on the order of terabytes is very common. With respect to such a disk storage system, there is a technique by which a single storage device subsystem is made up of a plurality of types of logical disks (which will sometimes be referred to merely as disks), e.g., as disclosed in U.S. Pat. No. 5,956,750, incorporated herein by reference. More specifically, that patent discloses a disk subsystem which is made up of disks having different RAID levels such as RAID5 and RAID1 as devices (logical disks) to be accessed by a host computer, or made up of disks having different access rates as actual magnetic disks (physical disks) of logical disks. A user can selectively use the devices according to the access rates of the respective devices.

SUMMARY OF THE INVENTION

The present invention provides a method to migrate between “normal volumes” and “virtual volumes” while maintaining the benefits of thin provisioning. Migration from a normal volume includes determining whether a data block contains production data. A data block which contains production data is identified as a segment in the thin provisioned volume. Those data blocks which do not contain production data are placed in a free segment list. Thereafter, data access can take place in the thin provisioned volume.

A further aspect of the present invention is migration of data from a thin provisioned volume to a normal volume. Each segment allocated to the thin provisioned volume is copied to a corresponding location in the normal volume according to the logical block address associated with the segment.

A further aspect of the present invention is creation of a normal volume having a bitmap that tracks the modification of blocks within the volume. The bitmap is used on migration from a normal volume to a virtual volume.

BRIEF DESCRIPTION OF THE DRAWINGS

Aspects, advantages and novel features of the present invention will become apparent from the following description of the invention presented in conjunction with the accompanying drawings, wherein:

FIG. 1 is a block diagram showing a configuration of a computer system to which a first embodiment of the present invention is applied;

FIG. 2 shows a functional representation of the system configuration of FIG. 1;

FIG. 3 shows information for defined parity groups;

FIG. 4 shows processing for SCSI write operations;

FIG. 5 shows configuration information for a thin provisioned volume;

FIG. 6 shows information for a free segment pool for thin provisioned volumes;

FIG. 7 shows the processing for a write operation on a thin provisioned volume;

FIG. 8 shows the data flow during a write operation in an LDEV;

FIG. 9 shows the processing for a read operation on a thin provisioned volume;

FIG. 10 shows the data flow of a read operation on a thin provisioned volume;

FIG. 11 shows a table of free LDEV's;

FIG. 12 shows configuration information for defined LDEV's;

FIG. 13 shows state changes during a migration from LDEV to LDEV;

FIG. 14 shows a table of pooled VDEV's;

FIG. 15 shows the flow for a migration operation between two VDEV's;

FIG. 16 shows a user interface for setting migration thresholds;

FIG. 17 illustrates triggering of migration;

FIG. 18 shows triggering for migration from an LDEV to a VDEV;

FIG. 19 shows triggering for migration from a VDEV to an LDEV;

FIG. 20 shows an example of an interface for recommending migrations;

FIG. 21 shows processing performed by a scheduler;

FIG. 22 shows the processing for migration operations between LDEV and VDEV;

FIG. 23 shows re-creation of the bitmap for an LDEV during migration of data from a VDEV to the LDEV;

FIG. 24 shows the flow of data during a migration from an LDEV to a VDEV;

FIG. 25 shows the flow of data during a migration from an LDEV to a VDEV that does not involve copying data;

FIG. 26 shows the flow of data during a migration from a VDEV to an LDEV;

FIG. 27 shows the system configuration according to another embodiment of the present invention;

FIG. 28 shows the functional view of the configuration shown in FIG. 27;

FIG. 29 shows an external mapping table for externally defined LUNs;

FIG. 30 shows a mapping from external LUN designations to internal LUN designations; and

FIG. 31 illustrates an example of a parity group defined by external LUNs.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

The first embodiment shows the migration from a Logical DEVice (LDEV), which is a volume comprising blocks of data on one or more physical devices, to a Virtual DEVice (VDEV), which comprises on-demand allocated segments, or from VDEV to LDEV on host's initial write using allocation-on-use technology.

FIG. 1 shows a diagram illustrating the hardware components and interconnections among the components. One or more host systems 2 each have an operating system (OS) and a hardware configuration of a conventional computer system; e.g., PC, workstation, Mini Computer or Mainframe. The host system includes a CPU 11, memory 12, and an internal disk 13. The host system further includes a host bus adapter (HBA) 14 for connection to a Fibre Channel (FC) switch 400 (or an Ethernet switch or the like). Each host system can store its data (e.g., production data created and used by applications such as a database) on a logical unit (LU) provided by a storage subsystem 30.

A console 402 is configured similarly to the host system 2, but may not be configured with an HBA. The console 402 is in communication with the storage subsystem 30 over a suitable communication channel. For example, FIG. 1 shows that the console 402 is connected to a switch 401, which in turn is connected to the storage subsystem 30. The console provides remote administrative access to the storage subsystem, allowing a system administrator to maintain and otherwise manage the subsystem.

The storage subsystem 30 is configured to provide storage using SCSI-2,3 command sets on its LUs. The storage subsystem comprises several RAID controllers (CTL) 20 and several physical storage devices 32. The controller 20 comprises components such as a processor, memory, and network interface cards (NICs) such as Ethernet cards or FC ports. The controller provides SAN (storage area network) capability, or can process SCSI I/O requests to provide RAID-based access to the physical storage devices 32. An initial embodiment of the present invention is based on an open system using SCSI. However, it is clear that the invention can be applied to other systems; e.g., Mainframes using CKD (Count Key Data) Format.

The controller 20 typically includes non-volatile random access memory (NVRAM) and can store data to the NVRAM. The NVRAM can serve as a data cache that is protected against power failures using battery protection for memory. In case of power failure, for example, data on the NVRAM may be de-staged to a storage configuration area on the physical storage devices 32 using a backup battery as a power source. The controller can provide FC ports, each having a WWN (World Wide Name) that specifies the target ID in the SCSI world, with LUNs defined on each FC port.

A management console 390 is typically provided for the customer engineer. It can be connected to the storage subsystem internally. The console 390 provides a GUI-based interface for the creation or deletion of parity groups among the physical devices 32, and interfaces related to user administrator functions such as the creation or deletion of a logical device, of a path between a logical device and an LU, and of a path between an LU and an FC port.

FIG. 2 is a diagram illustrating a logical view of the software components of the system shown in FIG. 1 and the interactions among them. The SAN 400 is a logical connection between a given Host 10 and the Storage Subsystem 30 using a switch or hub such as FC or Ethernet. This capability is provided primarily by a fibre channel switch, a hub, an Ethernet switch or hub, etc. The LAN/WAN 401 is a logical connection between the Console 402 and the Storage Subsystem 30 using switches such as Ethernet, FDDI, Token ring, and so on. The storage subsystem is connected to the LAN/WAN 401 so that it can be accessed from other hosts in order to manage the storage subsystem.

The storage subsystem 30 comprises various software components or modules. The functions provided by the software can be enabled in microcode that executes in the controller 20. The program code can be provided from an installation stored on optical media such as CD-ROM, or can be obtained from FD or other remote devices like an Internet connection to install microcode. The microcode comprises a conventional parity group manager (not shown), a logical device manager (LDEV Mgr) 23 that creates a logical device to provide logical storage from physical discs to an IO process 21, a Virtual Device Manager (VDEV Mgr) 22, and a migrater 24. Details of these processes are discussed further below.

The parity group manager is known, and thus not shown in FIG. 2. This module is part of the microcode in the controller 20. The parity group manager defines and maintains parity group information for physical storage devices 32 using RAID0/1/2/3/4/5/6 technology. RAID 6, based on RAID 5 technology, provides dual parity protection. The created parity group is listed in an LDEV Config table 29 (FIG. 3). The information in this table includes a parity group number 51 to identify the parity group within the storage subsystem, a usable capacity size 52 created from RAID technology, a RAID configuration 53, and the constituent physical storage devices 54. Additional information in the table is discussed below.

The LDEV manager 23 manages the structure of each LDEV and the behavior of IO from the LUs. The LDEV presents a logical storage area for an LU to store and present data from/to the host. The LDEV is part of a parity group. The administrator defines and initially formats a region of the LDEV, assigning it an LDEV number. The mapping between LDEV and parity group is stored in the LDEV Config table 29 (FIG. 3). For each parity group (field 51 in the LDEV Config table 29), a record is maintained for each LDEV in that parity group. The record includes an LDEV number 55 which identifies the LDEV, a start Logical Block Address (LBA) 56 which represents the LDEV's start address in the parity group, and an end LBA 57 which represents the LDEV's end address in the parity group.

The data used to represent an initialized volume can be ASCII “0” (zero). However, “0” is also sometimes used as the return value in a read function call to indicate an un-assigned segment in a VDEV (discussed later), which can create ambiguity. Therefore, another data representation can be selected, e.g., NULL (\0), as the NULL fill value in an initialized disk. This selection can be provided via the console 402. After the LDEV is initialized, the state of initialization is stored in FMT field 58 of FIG. 3. Upon initialization, the microcode turns a format bit ON (“1”) to indicate the LDEV has been initialized and not yet written to; the LDEV is said to be in an “initialized state.” If the bit is OFF (“0”), this indicates the LDEV has been written to and thus is no longer in the initialized state.

Each LDEV is associated with a bitmap 26. Each bit in the bitmap 26 corresponds to a block in the LDEV, and is initially set to OFF (e.g., logic “0”). When data is written to the block, the corresponding bit is set to ON (e.g., logic “1”). More generally, data in blocks which have been allocated to store data for applications on the host, or which are used by the operating system on the host to manage a file system, is referred to as production data. These blocks are referred to as allocated blocks. Data contained in blocks which are not allocated for application data and which are not used by the operating system can be referred to as non-production data. These blocks are referred to as unallocated blocks.

Where an LDEV comprises a large number of blocks, the blocks can be grouped into a smaller number of block-groups. This helps to keep the bitmap at a smaller, more convenient size. For example, an LDEV might comprise 256×2¹⁰ blocks, which would require a 256 kilobit bitmap. Instead, if each bit corresponds to 256 blocks, then the bitmap need only be 1 kilobit in size.
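The sizing arithmetic above can be checked with a short sketch. The function names bitmap_bits and bit_for_block are illustrative assumptions and do not appear in the specification; rounding a partial trailing group up to a full bit is likewise an assumption.

```python
def bitmap_bits(total_blocks: int, blocks_per_bit: int = 1) -> int:
    """Return the number of bitmap bits needed to cover total_blocks."""
    if total_blocks < 0 or blocks_per_bit < 1:
        raise ValueError("invalid block count or group size")
    # Ceiling division: a partial trailing group still needs its own bit
    # (assumption; the specification only gives an evenly divisible case).
    return -(-total_blocks // blocks_per_bit)


def bit_for_block(lba: int, blocks_per_bit: int = 1) -> int:
    """Return the index of the bit that tracks the block at address lba."""
    return lba // blocks_per_bit


ldev_blocks = 256 * 2**10
print(bitmap_bits(ldev_blocks))       # one bit per block: 262144 bits (256 kilobits)
print(bitmap_bits(ldev_blocks, 256))  # one bit per 256 blocks: 1024 bits (1 kilobit)
```

With grouping, a single ON bit means that at least one block of the group has been written, so the smaller bitmap trades precision for size.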

If an LDEV does not have a corresponding bitmap defined for it, a suitable process can be provided which allows a system administrator to create one. This can be requested via the Console 402. The LDEV manager 23 would read each block (or group of blocks) from the LDEV and either set the corresponding bit to OFF if the block (or group of blocks) has not been written (i.e., the data block is filled with NULLs), or set the corresponding bit to ON if the block (or at least one of the group of blocks) has been written. This aspect of the present invention is appropriate for existing storage systems (so-called legacy systems) which are not initially configured for data migration processing in accordance with the present invention.
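One way to picture this scan is the following minimal sketch, which treats the LDEV as an in-memory list of equally sized byte blocks. The name build_bitmap, the list representation, and the single-byte fill parameter are assumptions made for illustration only.

```python
def build_bitmap(blocks, blocks_per_bit=1, fill=b"\x00"):
    """Scan blocks and return one boolean per block group (True = ON)."""
    bitmap = []
    for start in range(0, len(blocks), blocks_per_bit):
        group = blocks[start:start + blocks_per_bit]
        # ON if at least one block in the group differs from the fill pattern.
        written = any(block != fill * len(block) for block in group)
        bitmap.append(written)
    return bitmap


# Hypothetical 4-block LDEV in which only block 1 holds data.
legacy_blocks = [bytes(4), b"data", bytes(4), bytes(4)]
per_block = build_bitmap(legacy_blocks)
per_pair = build_bitmap(legacy_blocks, blocks_per_bit=2)
```

A block that a host deliberately filled with the fill value is indistinguishable from an unwritten block under this scan, which is consistent with the text's definition of "not written" as "filled with NULLs".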

To accommodate the bitmap, the procedure for performing a SCSI write command is modified as shown in FIG. 4. Thus, in a Step 131, data is written to the LDEV via the LU specified by start LBA and size, in response to a write command. In a Step 132, the corresponding bit in the bitmap corresponding to the LDEV is set to ON. Upon the first write to an initialized LDEV, the microcode needs to indicate the fact that the LDEV is no longer in an initialized state. Thus, in the case of the first SCSI write command for the LDEV, the microcode makes a note of this occurrence. Recall in FIG. 3, the FMT field 58 shows whether the LDEV is in the initialized state (“1”) or not (“0”). After the first write operation is performed on the LDEV, the FMT field 58 is changed to “0” to indicate the volume has been written to or otherwise modified, and is therefore no longer initialized. As will be explained below, this FMT field 58 is used on migration for empty data from VDEV to LDEV.
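The modified write path can be sketched as follows. The class Ldev, its attribute names, and the in-memory block list are hypothetical stand-ins for the microcode structures; only the ordering of Step 131, Step 132, and the FMT update is taken from the description.

```python
class Ldev:
    """Toy LDEV with a per-block bitmap and an FMT flag (illustrative only)."""

    def __init__(self, num_blocks: int, block_size: int = 512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.bitmap = [False] * num_blocks  # bitmap 26: every bit starts OFF
        self.fmt = 1                        # FMT field 58: 1 = initialized state

    def write(self, start_lba: int, data: bytes) -> None:
        if len(data) % self.block_size != 0:
            raise ValueError("data must be a whole number of blocks")
        count = len(data) // self.block_size
        if start_lba < 0 or start_lba + count > len(self.blocks):
            raise IndexError("write exceeds LDEV bounds")
        for i in range(count):
            offset = i * self.block_size
            # Step 131: write the data to the block.
            self.blocks[start_lba + i] = data[offset:offset + self.block_size]
            # Step 132: set the corresponding bit to ON.
            self.bitmap[start_lba + i] = True
        if count:
            # Any write takes the LDEV out of the initialized state.
            self.fmt = 0


ldev = Ldev(num_blocks=8, block_size=4)
ldev.write(2, b"abcdefgh")  # covers blocks 2 and 3
```

After the call, only bits 2 and 3 are ON and the FMT flag has dropped to 0.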

The Virtual Device (VDEV) Manager 22 creates and manages thin-provisioned volumes as virtual devices to provide LUs that are based on virtual devices. When a write operation to a virtual-device-based LU requires the allocation of another block, the VDEV manager 22 allocates a storage segment from a segment pool 27-1 (see FIG. 6). The segment manager 27 manages the segment pool 27-1.

A storage segment is either “allocated” or “not allocated”. FIG. 2 shows “allocated” segments 37 and “not allocated” segments 38. An allocated segment contains data. The VDEV manager 22 maintains an allocation table 27-0 (FIG. 5) to manage the Virtual LBA (VLBA) space for the virtual devices that are defined by the thin provisioned volumes. The allocation table 27-0 includes a VDEV number field 141 which identifies the virtual device. A host visible size field 142 can be initialized using the SCSI READ CAPACITY command. The allocation table 27-0 also stores a record for each storage segment that is allocated to a virtual device. Each record includes a start VLBA field 143 which indicates the starting address in the virtual device that the storage segment represents, a Segment Size field 144 which indicates the size of each segment, and a Segment number field 145 which identifies the storage segment in the segment pool 27-1. If a segment does not contain data (i.e., has not been written to), then the Segment number field will be some undefined value that indicates the segment has not been written and thus not yet allocated; e.g., “−1”.

The “not allocated” segments (or “free” segments) are created from one or more LDEVs. Each LDEV is divided into a plurality of segments and added to the free segment pool 27-1. The free segment pool comprises a segment number field 146 which uniquely identifies the segment among all of the segments; this typically is simply a sequential numbering of the segments comprising the LDEV. An LDEV field 147 identifies the LDEV from which a particular segment originates. The LBA field 148 and Segment Size field 149 identify the location of a segment in the LDEV.
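As an illustration of how an LDEV might be carved into free segments, the sketch below builds records with the four fields named above (146 through 149). The dataclass FreeSegment, the function carve_segments, and the choice to drop a trailing partial segment are assumptions, not details given in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeSegment:
    segment_number: int  # field 146: unique, sequential number
    ldev: int            # field 147: originating LDEV
    lba: int             # field 148: start address within the LDEV
    size: int            # field 149: segment size in blocks


def carve_segments(ldev, ldev_blocks, segment_size, first_number=0):
    """Divide an LDEV of ldev_blocks blocks into fixed-size free segments."""
    if segment_size < 1:
        raise ValueError("segment_size must be positive")
    # Assumption: a trailing remainder smaller than one segment is not pooled.
    count = ldev_blocks // segment_size
    return [
        FreeSegment(first_number + i, ldev, i * segment_size, segment_size)
        for i in range(count)
    ]


# Hypothetical numbers: LDEV 201 with 1000 blocks, 256-block segments.
pool = carve_segments(ldev=201, ldev_blocks=1000, segment_size=256, first_number=300)
```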

FIG. 7 shows the processing for performing a write operation on a virtual-device-based LU. In a Step 111, a determination is made whether the target of the write operation has been allocated a storage segment or not. If not, then the process continues at a Step 112; otherwise processing proceeds to a Step 113. At Step 112, a storage segment is allocated from the free segment pool 27-1. Then in Step 113 the write operation is performed.

Step 111 involves an inspection of the allocation table 27-0 (FIG. 5). The entry for the virtual device (VDEV) that corresponds to the LU is consulted. The target address of the write operation is used to search the VLBA field 143. If the Segment number field 145 is not filled in (e.g., set to “−1”), then a storage segment has not yet been allocated.
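The allocate-on-write flow of FIG. 7 can be sketched as below. The class Vdev, the dictionary used as the allocation table, and the list used as the free segment pool are simplifying assumptions; a real controller would hold these in the tables 27-0 and 27-1.

```python
UNALLOCATED = -1  # the "not yet allocated" marker in Segment number field 145


class Vdev:
    """Toy thin provisioned volume (illustrative only)."""

    def __init__(self, size_blocks, segment_size, free_pool):
        self.segment_size = segment_size
        self.free_pool = free_pool  # shared list of free segment numbers
        # Allocation table: start VLBA -> segment number.
        self.table = {
            vlba: UNALLOCATED for vlba in range(0, size_blocks, segment_size)
        }
        self.data = {}  # segment number -> {offset in segment: block}

    def write(self, vlba, block):
        start = (vlba // self.segment_size) * self.segment_size
        if start not in self.table:
            raise IndexError("VLBA outside the host visible size")
        if self.table[start] == UNALLOCATED:           # Step 111
            if not self.free_pool:
                raise RuntimeError("free segment pool exhausted")
            self.table[start] = self.free_pool.pop(0)  # Step 112
            self.data[self.table[start]] = {}
        self.data[self.table[start]][vlba - start] = block  # Step 113
        return self.table[start]


free_pool = [301, 302]
vdev = Vdev(size_blocks=40000, segment_size=1000, free_pool=free_pool)
first = vdev.write(22520, b"x")   # allocates a segment for VLBA 22000..22999
second = vdev.write(22999, b"y")  # same segment, no new allocation
```

Only the first write into a segment's address range consumes a free segment; later writes to the same range reuse it.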

An important feature of the thin provisioning aspect of the present invention is that the thin provisioned volume is dynamically expanded as storage is needed, and that the expansion occurs automatically without user involvement.

FIG. 8 illustrates the processing of the flowchart of FIG. 7. For example, a write operation is issued for VDEV 115, targeting LBA 22520 in the VDEV. Assuming the storage segment 116 corresponding to the target address of 22520 has not yet been allocated, the VDEV manager 22 allocates a segment (#301) from the free segment pool 117. FIG. 8 also shows an underlying LDEV 201 that is configured to implement the free segment pool. The LDEV 201 is partitioned into appropriately sized segments. Each of the segments is numbered and listed in the table 27-1 (FIG. 6), and the segments thus collectively constitute the free segment pool 117.

FIG. 9 shows the actions performed for a read operation. FIG. 10 illustrates the processing of FIG. 9. Thus, in a Step 101, a determination is made whether the storage segment that corresponds to the target LBA of the read operation has been allocated or not. If not, then in a Step 103, a suitable NULL response is returned indicating that the target LBA is an unwritten area in storage. Typically, the response includes the amount of data read, which in this case is zero. The value is defined in Console 402 when the LDEV is initialized. On the other hand, if the target LBA falls within the address range of an allocated storage segment, then the data in the storage segment is read out and returned, Step 102.

The determination made in Step 101 is made by consulting the allocation table 27-0. First, the VDEV that corresponds to the accessed LU is determined, thus identifying the correct entry in the VDEV field 141. The target LBA is compared to the start VLBA fields 143 of the corresponding VDEV to identify the corresponding storage segment. The Segment number field 145 is then consulted to determine if the segment has been allocated or not; processing then proceeds to Step 102 or Step 103 accordingly.
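A minimal sketch of this lookup follows. The function read_block, the dictionary-based table, and returning the fill value for an unwritten block inside an allocated segment are all assumptions made for illustration.

```python
UNALLOCATED = -1  # marker for an unallocated entry in Segment number field 145


def read_block(table, segments, segment_size, vlba, fill=b"\x00"):
    """Return the block at vlba, or the fill value if nothing was written.

    table maps start VLBA -> segment number; segments maps
    segment number -> {offset in segment: block}.
    """
    start = (vlba // segment_size) * segment_size
    segment_number = table[start]  # raises KeyError if outside the volume
    if segment_number == UNALLOCATED:
        return fill                # Step 101 -> Step 103: NULL response
    # Step 102: read from the allocated segment.
    return segments[segment_number].get(vlba - start, fill)


# Hypothetical state: one allocated segment (#106) covering VLBA 1000..1999.
table = {0: UNALLOCATED, 1000: 106}
segments = {106: {520: b"D"}}
hit = read_block(table, segments, 1000, 1520)
miss = read_block(table, segments, 1000, 20)
```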

FIG. 10 shows the situation where the target LBA accesses a previously allocated storage segment. A read request is shown targeting LBA 22520 which maps (via allocation table 27-0) to segment 106. Segment 106 is shown to reside on LDEV 201 at the block location 107. The actual data for the read operation is then read from LDEV 201.

The IO process 21 processes IO requests made to an LU from a host. The IO process 21 comprises a component (not shown) for handling SCSI I/O operations. The IO process includes a table 25 (FIG. 12) that maps LUs to ports in the storage subsystem 30. The table 25 is used by the controller 20 to coordinate information between ports and LUs. The table includes a port number field 81 to identify the physical FC port, a WWN field 82 which associates the world wide name (WWN) to the port, a logical unit number (LUN) field 83, and a device name field 84.

The Migrater 24 performs migration operations to move data between LDEVs and VDEVs according to the present invention. The migration operations include migrating data between LDEVs, migrating data from an LDEV to a VDEV, migrating data from a VDEV to an LDEV, and migrating data between VDEVs.

In the migration of data from a first LDEV to a second LDEV, the administrator specifies an LU as the source LDEV and selects a target LDEV. The target LDEV is selected from the free LDEV pool 173 (FIG. 11) via a suitable interface provided on console 390 or console 402. The free LDEV pool 173 shows the change in state for each LDEV. There are three states: One state is “Used LDEV” 172 which indicates those LDEVs that have been assigned to an LU or to a free segment pool 27-1 (as discussed above, and discussed further below). Another state is “Free LDEV” 173 which indicates those LDEVs that are not assigned to an LU or to a free segment pool 27-1. The final state is “Reserved LDEV” 174 which indicates those LDEVs in an intermediate state of operation. More specifically, these LDEVs are those which have been allocated for a migration operation which is still in progress.

The Migrater 24 can schedule a task to reserve the target LDEV and to perform the migration operation. When the migration task executes, the Migrater 24 creates a mirror pair between the source LDEV and the target LDEV. During mirroring, the host's write IO is sent to the source LDEV and to the target LDEV, setting bits in the associated bitmap that correspond to blocks written on the target LDEV; the mirror copy is skipped for blocks which have already been written by the host. If migration is performed in an “online” manner, then the Migrater 24 suspends the host's IO directed to the source LDEV after completion of the mirror operation, and splits the mirror pair. The Migrater 24 then changes the LU designation that is used by the host to point to the target LDEV. The source LDEV then becomes a free LDEV. If migration is performed in an “offline” manner, then the Migrater 24 simply continues to process IOs for the source LDEV upon completion of the data migration. Performing “offline” migration allows the administrator to re-use the target LDEV; e.g., connecting it to another LU, or the LU may have been already assigned to an LDEV before the mirror operation.

FIG. 13 shows the change of state during migration. In Step 1, the Migrater 24 reserves a target LDEV 187 and enters a migration task to the scheduler. Then in Step 2, the scheduler invokes the task and starts to migrate data from used LDEV 186. This includes mirroring data from the source LDEV to the reserved LDEV which is the target LDEV 187. Of course, during the mirroring, the host's write IO is sent to the source LDEV and to the target LDEV. If migration is on-line, source IO is suspended and the path is changed to the target LDEV. After the mirroring, the target LDEV is changed to a Used LDEV state and the source LDEV is changed to a Free LDEV state in Step 3.
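The state changes of FIG. 13 can be sketched as a small state table. The class LdevPool and its method names are hypothetical, and the data copy of Step 2 is deliberately omitted; the sketch covers only the bookkeeping for the online case, in which the source becomes free.

```python
class LdevPool:
    """Tracks each LDEV as 'used', 'reserved', or 'free' (illustrative only)."""

    def __init__(self, used, free):
        self.state = {number: "used" for number in used}
        self.state.update({number: "free" for number in free})

    def reserve(self, target):
        # Step 1: a free LDEV is reserved as the migration target.
        if self.state[target] != "free":
            raise ValueError("target LDEV is not free")
        self.state[target] = "reserved"

    def complete(self, source, target):
        # Step 3: after mirroring, the target becomes used and the source free.
        if self.state[source] != "used" or self.state[target] != "reserved":
            raise ValueError("migration is not in a completable state")
        self.state[target] = "used"
        self.state[source] = "free"


ldev_pool = LdevPool(used=[186], free=[187])
ldev_pool.reserve(187)
state_during_migration = dict(ldev_pool.state)
ldev_pool.complete(source=186, target=187)
```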

To migrate data from one VDEV to another VDEV, the administrator specifies a target LU on the console. To ensure that the data migration occurs properly, there is the idea of a VDEV number. The controller 20 has a table of Pooled VDEV 28-1 (FIG. 14) to manage the state of the VDEVs. The table includes a “Used” VDEV number field 475 that shows the VDEVs which have already been assigned to an LU, a “Reserved” VDEV field 476 that shows the VDEV number of the target VDEV that has been reserved for the migration operation, and a “Free” VDEV field 477 that shows VDEVs which have not been assigned to an LU.

During a migration operation, the Migrater 24 on the storage subsystem 30 picks a free VDEV from the Free VDEV field 477 in the VDEV pool 28-1, and moves the VDEV number of the selected VDEV to the Reserved VDEV field 476. A migration task is then created and scheduled. The migration task is executed as shown in Step 1 in FIG. 15.

When the task is executed, the Migrater 24 allocates a new storage segment (Step 2.1 in FIG. 15) and copies data, segment by segment, from each segment on the source VDEV to the new segment on the target VDEV (Step 2.2 in FIG. 15). Of course, during the copying, the host's write IO is sent to the source VDEV and to the target VDEV so that the data is also written on the target VDEV. If migration is performed in an “online” manner, then the host will be “connected” to the target VDEV upon completion of the migration. The Migrater 24 suspends the host's IOs after completing copying of all the segments from the source VDEV to the target VDEV. The LU designation that is used by the host to access the volume is changed to point to the target VDEV (Step 3.1 in FIG. 15). The VDEV number of the target is moved from the Reserved VDEV field 476 (FIG. 14) to the Used VDEV field 475. The segments in the source VDEV are put into the free segment pool 117 and the source VDEV number is moved to the Free VDEV field 477 (Step 3.2 in FIG. 15). If migration is performed in an “offline” mode, then the Migrater 24 continues to process IOs using the source VDEV. The administrator can re-use the target VDEV after the pair is split by assigning an LU to the VDEV, or the LU may have been assigned to the VDEV before the copy in the case of offline operation; Step 1 in FIG. 15.

The scheduler that is used to schedule the migration tasks is typically provided by the OS. For example, the “cron” utility is provided on UNIX-based OSs. The Windows® operating system from Microsoft also provides for task scheduling. As mentioned, user access to schedule and otherwise monitor migration tasks can be provided by the console 390 in the storage subsystem 30, or remotely via the console 402.

Typical operation of the present invention involves a user (e.g., a customer service engineer) creating a parity group from among the physical storage devices 32. Next, a system administrator creates a plurality of LDEVs from the parity group. The administrator assigns at least one of the LDEVs to the free segment pool. The storage subsystem 30 then divides the LDEV, according to predetermined segment size criteria, into a plurality of segments which constitute the free segment pool. To create a VDEV, the administrator picks a VDEV number from VDEV number pool 477 in FIG. 14 and a size for the VDEV. To access an LU from the host, the administrator defines a path between the VDEV or LDEV and the LU.

A migration operation of data from an LDEV to a VDEV requires that at least one LDEV is associated with an LU. The free segment pool must have free segments for allocation. There must be an available VDEV in the VDEV pool 477 (FIG. 14) for allocation. Similarly, a migration operation of data from a VDEV to an LDEV requires a VDEV that is associated with an LU. A free LDEV from the LDEV pool 173 (FIG. 11) must be available for allocation.

Before migration commences, the administrator needs to know which LDEV or VDEV is best to use and must create a task in the scheduler to initiate the migration process. The basic logic is that the storage subsystem performs scheduled checks of the rate of written data to an LU comprising VDEVs or to an LU comprising LDEVs on a storage subsystem, e.g., on a monthly basis, quarterly, or the like. The storage subsystem checks the proportion of allocated segments among the segments in the VDEV and checks the turned-on bits in the bitmap for the LDEV (indicating that the corresponding segment for the LDEV was modified since the initial format of the LDEV).

FIG. 16 shows a graphical interface that can be used to set a threshold 231 (more generally a criterion) for activating the migration process. In the example shown, the value entered in the field 231 represents the percentage utilization of an LU that will trigger a migration. For example, suppose the value is 50%, and suppose the LU is initially associated with an LDEV. If the amount of storage used on the LDEV falls below 50%, then this will trigger a migration of data from the LDEV to a VDEV, where the LU is then associated with the VDEV after the migration. If later the usage of the LU (now associated with a VDEV) rises above 50%, then this could trigger a migration of the data back to an LDEV, where the LU is then associated with the LDEV. The GUI shown in FIG. 16 can include a field (not shown) that specifies how often to perform a check of the usage level of the LDEV or VDEV that the LU is currently associated with.

Since data migration is a large undertaking, it may be more practical to simply recommend to the system administrator that a migration operation is indicated for an LU, rather than autonomously performing the migration. The system administrator can make the final decision based on the recommendation.

FIG. 17 shows the processing by which a migration is triggered. This process can be periodically performed at a predetermined rate, or according to a schedule; either of which can be user-specified. In a step 201, a check is made whether the criteria for migrating data from an LDEV to a VDEV have been met. This is discussed in further detail in FIG. 18. In a step 202, a check is made whether the criteria for migrating data from a VDEV to an LDEV have been met. This is discussed in further detail in FIG. 19. If there is an alert list (step 203), then each user in the alert list is notified in a step 204. The notification can be made in any of numerous ways; e.g., email, fax, pager, SNMP trap, etc. Thus, the example shown in FIG. 16 illustrates a simple criterion for deciding when to perform a migration, namely, monitoring the usage level. For discussion purposes, this simple criterion will be used as an illustrative example. It can be appreciated, however, that other criteria can be readily employed.

FIG. 18 shows the processing for determining which LDEVs are migrated. In a step 206, a check is made whether each LDEV has been examined for migration. If all the LDEVs have been examined, then the process ends. Steps 207 and 208 constitute an example of a criterion (indicated by the dashed lines) for triggering migration or making a recommendation to perform a migration. Step 207 checks the number of bits that are turned on in the bitmap corresponding to the LDEV being examined. This indicates the usage level of the LDEV. For example, the usage rate might be computed as:

usage rate(LDEV) = (turned on bits / total # of bits) * 100

In step 208, if the usage level falls below a threshold percentage (as set in FIG. 16, for example), then the LU that is associated with this LDEV is scheduled or recommended for data migration to a VDEV. Alternatively, the LDEV check can use a threshold X while the VDEV check uses a dedicated threshold Y that is independent of X; in this case, no migration is suggested while the usage level lies between the X and Y thresholds. Processing continues to step 206 to examine the next LDEV.
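By way of illustration only, the check of steps 207 and 208 can be sketched as follows. The sketch assumes the LDEV bitmap is available as a list of 0/1 values; the function names are hypothetical and are not taken from the disclosed microcode.

```python
def ldev_usage_rate(bitmap):
    """Step 207 (sketch): percentage of bits turned on in the LDEV bitmap."""
    if not bitmap:
        return 0.0
    return sum(bitmap) / len(bitmap) * 100


def recommend_ldev_to_vdev(bitmap, threshold_pct):
    """Step 208 (sketch): recommend migration to a VDEV when the usage
    level falls below the threshold percentage."""
    return ldev_usage_rate(bitmap) < threshold_pct
```

With a 50% threshold, an LDEV having one of four bits turned on (25% usage) would be recommended for migration, whereas one having three of four bits turned on (75% usage) would not.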

FIG. 19 shows the processing for determining which VDEVs are migrated. In a step 211, a check is made whether each VDEV has been examined for migration. If all the VDEVs have been examined, then the process ends. Steps 212 and 213 constitute an example of a criterion (indicated by the dashed lines) for triggering migration or making a recommendation to perform a migration. Step 212 checks the number of segments that have been allocated to the VDEV being examined. This indicates the usage level of the VDEV. For example, the usage rate might be computed as:

usage rate(VDEV) = (assigned segments / total # of segments) * 100

In step 213, if the usage level rises above a threshold percentage (as set in FIG. 16, for example), then the LU that is associated with this VDEV is scheduled or recommended for data migration to an LDEV. Processing continues to step 211 to examine the next VDEV.
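The VDEV check of steps 212 and 213 can be sketched in the same manner. The sketch assumes the Segment field entries of the allocated segment table for one VDEV are available as a list, with "−1" marking an unallocated entry as in FIG. 5; the names are hypothetical.

```python
UNALLOCATED = -1  # marker for an empty Segment field entry (as in FIG. 5)


def vdev_usage_rate(segment_entries):
    """Step 212 (sketch): percentage of segment entries that are assigned."""
    if not segment_entries:
        return 0.0
    assigned = sum(1 for seg in segment_entries if seg != UNALLOCATED)
    return assigned / len(segment_entries) * 100


def recommend_vdev_to_ldev(segment_entries, threshold_pct):
    """Step 213 (sketch): recommend migration to an LDEV when the usage
    level rises above the threshold percentage."""
    return vdev_usage_rate(segment_entries) > threshold_pct
```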

As another criterion for steps 207/212 and steps 208/213, the number of read/write accesses to an LDEV or a VDEV may be used to determine activity in the LDEV or VDEV. Migration of data from an LDEV to a VDEV can be performed if there are too few read/write accesses to the LDEV. In the case of migrating data from a VDEV to an LDEV, migration can be performed if there are many read and write accesses. In this operation, an Administrator defines a threshold X of the counter that determines when a VDEV is to be migrated to an LDEV. The Administrator also defines a threshold Y of the counter that determines when an LDEV is to be migrated to a VDEV. Each VDEV and LDEV has a counter of the number of I/O accesses, which is monitored periodically within a term such as a week, a month, a quarter, or a year. The counter watches each read and write I/O access and increases the count until the microcode checks for a recommendation as in steps 208/213; after each recommendation, the counter is reset.

As in step 208, the microcode compares the counter with the defined threshold. If the counter is above the threshold X, the microcode recommends migrating the data to an LDEV. Likewise, as in step 213, the microcode compares the counter with the defined threshold. If the counter falls below the threshold Y, the microcode recommends migrating the data to a VDEV.
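The access-count criterion can be sketched as follows. This is only an illustrative model: the class name, the string labels, and the choice to apply threshold X only to VDEVs and threshold Y only to LDEVs are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class AccessCounter:
    """Per-device I/O counter (hypothetical model of the described counter)."""
    kind: str        # "LDEV" or "VDEV"
    count: int = 0

    def record_io(self):
        # Called for each read or write I/O access to the device.
        self.count += 1

    def check(self, threshold_x, threshold_y):
        """Return a recommendation (or None) and reset the counter."""
        if self.kind == "VDEV" and self.count > threshold_x:
            recommendation = "migrate to LDEV"   # heavily accessed VDEV
        elif self.kind == "LDEV" and self.count < threshold_y:
            recommendation = "migrate to VDEV"   # rarely accessed LDEV
        else:
            recommendation = None
        self.count = 0  # the counter is reset after each check
        return recommendation
```

The check would be invoked once per monitoring term (e.g., weekly), so that each recommendation reflects only the accesses counted during that term.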

FIG. 20 shows an example of a GUI that lists the recommendations for migration. The interface shows a source LU field 221 which identifies the LU that contains the data that is the object of a possible migration operation. A target device field 222 identifies the target of the data migration. A configuration field 223 indicates whether the device identified in the target device field 222 is configured as an LDEV or a VDEV. These fields are obtained from the table 25 shown in FIG. 12. A recommendation field 224 shows the results of the processing outlined in FIGS. 17-19. A usage field 225 shows the amount of used space on an LDEV, or in the case of a VDEV the amount of allocated segments. In the figure, the usage is expressed as a percentage of the total available space or segments. A request migration field 226 is an input field that allows the user to select an LU for migration or not.

The GUI shown in FIG. 20 can be enhanced to allow the user to select the target LDEV or VDEV, by specifying an LDEV number in the target device field 222. The GUI can also be enhanced with a field that specifies "online" migration, meaning that when an LU has migrated its data to the target, the LU is then assigned to that target for subsequent IO.

When the Apply button is "activated" by the user via a mouse click, for example, any selected migration operations are then scheduled. FIG. 21 shows the processing performed by the scheduler. This is a standard wait loop which looks for tasks that are scheduled.

FIG. 22 shows the processing for a migration operation 150, which comprises the following:

-   Step 151: The Migrater 24 creates a pair of a source device and a target device.
-   Step 152: A check is made on the direction of migration. If the migration is from an LDEV to a VDEV, then processing continues at Step 153. If migration is from a VDEV to an LDEV, then processing continues at Step 154.
-   Step 153: The Migrater 24 copies data from the source LDEV to the target VDEV based on the corresponding bitmap. The migration continues until data between the source device and the VDEV is synchronized. If a host sends a write IO to the source during the copying, the data for the write IO is also sent to the target, where it is written after a segment has been allocated.
-   Step 154: The Migrater 24 allocates segments and copies data from the source VDEV to the target LDEV based on the allocated segment table 27-0 (FIG. 5). If the host sends a write IO to the source during the copying, the data for the write IO is also sent to the target to be written, and the bit in the LDEV's bitmap that corresponds to the written segment is turned ON. Also, the Migrater fills empty segments (shown as "−1" in the Segment field 145 in FIG. 5) in the LDEV with a fill character. Typically, the fill value, which is ASCII "0" (zero) or the NULL character (\0), is the same as the value used by the LDEV's format operation to indicate an empty block. The fill is performed when the volume is not in the NULL-initialized state, which is assumed to be the case when some of the bits in the bitmap are "1". If the volume is already initialized with NULL, in which case all of the bits in the bitmap are "0", the filling operation is skipped. This check is done before Step 154. FIG. 3 includes a FMT field 58 to indicate whether the LDEV is in the initialized state ("1") or not ("0").
-   Step 155: The Migrater 24 creates a bitmap table for the target device.
-   Step 156: The Migrater 24 confirms whether the migration task is an online operation or an offline operation. If the task is an online migration, this procedure goes to Step 157. If the task is an offline migration, this procedure goes to Step 159.
-   Step 157: The Migrater 24 suspends the source and target. If the host issues an IO operation, it will be placed in a wait state until Step 158 is performed.
-   Step 158: The Migrater 24 changes the path from the source device to the target device. The host can then resume its IO operations.
-   Step 159: The Migrater 24 discards the pair.
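The branching among the steps listed above can be sketched as follows. The sketch only traces which steps would be visited for a given direction and mode; it assumes that Step 155 follows either copy step and that Step 159 follows Step 158, which the list suggests but does not state explicitly. The function name and direction labels are hypothetical.

```python
def migration_steps(direction, online):
    """Return the sequence of step numbers visited by migration operation 150."""
    steps = [151, 152]              # create pair, check direction
    if direction == "LDEV->VDEV":
        steps.append(153)           # copy based on the LDEV bitmap
    elif direction == "VDEV->LDEV":
        steps.append(154)           # copy based on the allocated segment table
    else:
        raise ValueError("unknown migration direction: " + direction)
    steps += [155, 156]             # create target bitmap, check online/offline
    if online:
        steps += [157, 158]         # suspend the pair, then switch the path
    steps.append(159)               # discard the pair
    return steps
```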

FIG. 23 shows the re-creation of a bitmap for an LDEV that was the target of a migration operation, performed in step 155 above. After the data has been copied over to the target LDEV from the source VDEV (step 161), a bitmap for the LDEV must be created. In a step 162, the Migrater 24 gets a next segment from the allocated segments and turns on the corresponding bits in the bitmap associated with the LDEV.
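The bitmap re-creation can be sketched as follows, under the simplifying assumption that one bitmap bit corresponds to one segment-sized region of the LDEV and that the source VDEV's Segment field entries are available as a list with "−1" marking unallocated entries. The function name is hypothetical.

```python
UNALLOCATED = -1  # empty Segment field entry (as in FIG. 5)


def rebuild_ldev_bitmap(segment_entries):
    """Turn on the bit for every region that received data from an
    allocated segment; leave the bits for unallocated regions off."""
    return [0 if seg == UNALLOCATED else 1 for seg in segment_entries]
```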

FIG. 24 shows the data flow for migration of data from an LDEV to a VDEV resulting from the migration process of FIG. 22. The Migrater 24 creates a pair relationship between the VDEV and the LDEV. Data is then copied from blocks in the source LDEV based on the bitmap table 26 corresponding to the source LDEV. Prior to the copy operation of a block of data from the LDEV, the Migrater allocates a segment from the free segment pool and creates an entry for the segment in the allocated segment table 27-0 associated with the VDEV. The block of data from the LDEV is then copied to the allocated segment in the VDEV. When the migration is complete, the LDEV can be re-assigned to another LU. The VDEV is associated with the LU that was originally associated with the LDEV in the ONLINE case, or is associated with another LU in the case of an OFFLINE operation.
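The LDEV-to-VDEV data flow can be sketched as follows. The sketch models the LDEV as a list of blocks, the free segment pool as a list of segment numbers, and the segment storage as a dictionary; these structures and names are assumptions for illustration, not the disclosed table layouts.

```python
UNALLOCATED = -1


def migrate_ldev_to_vdev(ldev_blocks, bitmap, free_pool):
    """Copy only the blocks whose bitmap bit is on.

    Returns (segment_table, segment_store): segment_table[i] is the segment
    holding block i (or UNALLOCATED), and segment_store maps a segment
    number to the copied data.
    """
    segment_table = [UNALLOCATED] * len(bitmap)
    segment_store = {}
    for i, bit in enumerate(bitmap):
        if bit:
            seg = free_pool.pop(0)          # allocate a segment before copying
            segment_table[i] = seg          # record the entry for this block
            segment_store[seg] = ldev_blocks[i]
    return segment_table, segment_store
```

Blocks whose bit is off consume no segment, which is how the target VDEV retains the space savings of thin provisioning.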

FIG. 25 shows an embodiment which avoids the copying of data. The source LDEV is identified by way of the LU designation associated with the LDEV. An available VDEV number is selected from the table 28-1 (FIG. 14) and thus identifies the target VDEV. Basically, the bitmap corresponding to the source LDEV is converted into the table 27-0 (FIG. 5) and the free segment pool of the target VDEV. The Migrater 24 proceeds down the bitmap associated with the source LDEV. The VDEV number identifies the corresponding VDEV entry (field 141) of the table 27-0 (FIG. 5). For each bit that is set (i.e., ON), indicating there is data in the corresponding block, the sequence number of the corresponding block is entered into the appropriate entry in the Segment field 145 of the table 27-0, using the LBA address of the corresponding block as a key into the table 27-0. The sequence numbers of the blocks in the LDEV whose bit is not set are entered into the free segment pool 117. In this way, there is no actual copying of data from the source LDEV to the target VDEV.
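The copy-free conversion can be sketched as follows. The sketch assumes that the sequence number of a block is simply its index in the bitmap and that the resulting table is indexed by block position in place of an LBA key; both are simplifications, and the function name is hypothetical.

```python
UNALLOCATED = -1


def convert_ldev_to_vdev(bitmap):
    """Derive a segment table and a free segment pool from an LDEV bitmap
    without moving any data."""
    segment_table = []
    free_pool = []
    for seq, bit in enumerate(bitmap):
        if bit:
            segment_table.append(seq)          # block already holds the data
        else:
            segment_table.append(UNALLOCATED)  # nothing stored at this address
            free_pool.append(seq)              # block becomes a free segment
    return segment_table, free_pool
```

For example, a bitmap of 1, 0, 1, 0 yields the segment entries 0, −1, 2, −1 and a free pool containing blocks 1 and 3.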

FIG. 26 shows the data movement and the creation of a bitmap for the target LDEV during a migration from a VDEV to an LDEV, as shown in the process flow of FIG. 22. A copy pair is created comprising the source VDEV and the target LDEV. Using the entry in table 27-0 (FIG. 5) that corresponds to the source VDEV, each segment in the VDEV is copied to the LDEV at the address indicated in the VLBA field 143. If the LDEV is not formatted (i.e., the state of the FMT field 58 in FIG. 3 is "0"), then when the microcode encounters a "−1" value in the Segment field 145 of FIG. 5, the microcode writes fill data to the region addressed by that segment, within the range indicated by the Start LBA field 56 and the End LBA field 57 in FIG. 3.
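The VDEV-to-LDEV data movement, including the fill of empty regions, can be sketched as follows. The sketch models the target LDEV as a list of blocks that may hold stale data, and represents the FMT field as a boolean; the names and the single-byte fill value are assumptions for illustration.

```python
UNALLOCATED = -1


def migrate_vdev_to_ldev(segment_table, segment_store, ldev_blocks,
                         formatted, fill=b"\x00"):
    """Copy allocated segments into the LDEV; fill empty regions with the
    fill value only when the LDEV is not already formatted."""
    for i, seg in enumerate(segment_table):
        if seg != UNALLOCATED:
            ldev_blocks[i] = segment_store[seg]   # copy the segment's data
        elif not formatted:
            ldev_blocks[i] = fill                 # overwrite stale contents
    return ldev_blocks
```

When the LDEV is already formatted, the empty regions are left untouched, which avoids redundant writes.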

In accordance with a second embodiment of the present invention, the storage subsystems 32 (FIG. 1) are external storage systems. The benefit of this configuration is the added flexibility of using an external storage resource. FIG. 27 shows a system configuration according to this embodiment. Each of one or more host systems 2 has an operating system (OS) and the hardware configuration of a conventional computer system. The host system includes a CPU 11, memory 12, and an internal disk 13. The host system further includes a host bus adapter (HBA) 14 for connection to a Fibre Channel (FC) switch 35 (or an Ethernet switch or the like). Each host system can store its data (e.g., production data created and used by applications such as a database) on a logical unit (LU) provided by a storage subsystem 40.

The storage subsystem 40 is configured to provide storage using SCSI-2 and SCSI-3 command sets on its LUs. The storage system comprises several RAID controllers (CTL) 45 and several physical storage devices 49. The controller 45 comprises components such as a processor, memory, and network interface cards (NICs) such as Ethernet cards or FC ports (not shown). The controller provides SAN (storage area network) capability, or can process SCSI I/O requests to provide RAID-based access to the physical storage devices 49.

The controller 45 typically includes non-volatile random access memory (NVRAM) and can store data to the NVRAM. The NVRAM can serve as a data cache that is protected against power failures. In case of power failure, for example, data on the NVRAM can be de-staged to a storage configuration area on the physical storage devices 49 using a backup battery as a power source. The controller provides FC ports (e.g., port 46), each of which has a WWN (World Wide Name) that specifies the target ID in SCSI terms, and each of which presents LUNs. An additional port 47 is provided for connection to an external storage system 30 via a switch 91. The external storage system 30 comprises external storage devices 32.

FIG. 28 shows a functional view of the system of FIG. 27. The external storage subsystem 30 defines logical units (LUs). A mapping table 240 (FIG. 30) provides access to the internal LUs defined by storage subsystem 30 from storage subsystem 40. The mapping table includes an external LUN field 241 which contains LU numbers (LUNs) that are used by the storage subsystem 40 to access the LUs of storage subsystem 30. A Size field 242 indicates the size of the LU. A WWN field 243 stores the WWN to access an LU. An internal LUN field 244 represents the LU number used internally by the storage subsystem 30.

The storage subsystem 40 includes a Disc-External LU mapping table 230 (FIG. 29) which provides a mapping capability to see the external LUs defined on storage subsystem 30. The mapping table 230 is the same as the table shown in FIG. 3. The Disk number field 234 points to the external LU number field 241 of the mapping table 240.

As an example of how these mappings can be used, consider the example shown in FIG. 31. A one terabyte (TB) logical unit can be defined in storage subsystem 40 comprising logical units in storage subsystem 30. Parity group 3 in mapping 230 shows such a configuration. The 1 TB LUN comprises LUNs Ex₁ and Ex₂ of storage subsystem 30. As can be seen from mapping 240, LUN Ex₁ is a 500 gigabyte (GB) LUN, as is LUN Ex₂.
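How an address in the 1 TB logical unit could resolve to one of the two 500 GB external LUNs can be sketched as follows. The sketch assumes a simple concatenation of the two external LUNs, ignores any header area, and uses byte offsets with decimal gigabytes; the identifiers "Ex1" and "Ex2" and the function name are stand-ins chosen for illustration.

```python
GB = 10**9

# Members of the concatenated 1 TB logical unit: (external LUN, size in bytes).
EXTERNAL_LUNS = [("Ex1", 500 * GB), ("Ex2", 500 * GB)]


def resolve(offset, members=EXTERNAL_LUNS):
    """Map a byte offset in the concatenated LU to (external LUN, offset
    within that external LUN)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    for name, size in members:
        if offset < size:
            return name, offset
        offset -= size  # skip past this member and try the next one
    raise ValueError("offset is beyond the end of the logical unit")
```

Under these assumptions, an offset of 750 GB falls 250 GB into the second external LUN.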

The header information 511, 512 is located at an offset in the LBA (Logical Block Address) space, which is the address space of the LU. The header, which holds the unit and other functional information for the LU, is 5 MB in size. The 5 MB size is only an example; the size may be extended to accommodate new information. The header records, for the data space, the parity group number, the parity group to which the LU belongs, the size, the affiliated LU (port and LUN on the port), the number of logical disks, the configuration (concatenation, RAID 0/1/2/3/4/5/6, etc.), the sequence of the LU within the configuration, and the data space size, which is the total LU size minus the header for the LDEV.

The migration operation for this embodiment of the present invention is the same as discussed above. The fact that there is an external storage subsystem is hidden by the use of the external LUN mapping provided between mappings 240 and 230.

Further detail regarding the processing for thin provisioned volumes is disclosed in commonly owned U.S. Pat. No. 6,725,328, which is herein incorporated by reference in its entirety for all purposes.

1-2. (canceled)
 3. A storage system comprising: a first port for receiving commands from a host; a second port for transferring data and commands to a plurality of storage devices; a processor; and a memory storing programs, wherein said programs control a plurality of virtual devices, for which allocations from a pool are performed in response to a write operation, wherein said programs manage a plurality of logical devices, of which segments are allocated to said plurality of storage devices and are associated with logical block addresses, where said plurality of logical devices present a logical storage area for a logical unit to store and present data to and from said host, wherein said programs process a first migration from a first virtual device of said plurality of virtual devices to a first logical device of said plurality of logical devices, and wherein when said programs process said first migration, a pair is created between said first virtual device and said first logical device, and data stored in said first virtual device is copied to a portion of said first logical device.
 4. The storage system according to claim 3, wherein said plurality of logical devices are defined by an administrator, and mapping between said plurality of logical devices and parity groups is stored in said memory, and wherein in response to write operations, if a target of the write operation has not been allocated a storage segment, a storage segment is allocated from said pool before the write operation is performed.
 5. The storage system according to claim 4, wherein remainder of said portion of said first logical device not copied is filled by “0,” if the first logical device is not formatted.
 6. The storage system according to claim 3, wherein said programs process a second migration from a second logical device of said plurality of logical devices to a second virtual device of said plurality of virtual devices, and wherein when said programs process said second migration, a copy pair is created between said second logical device and said second virtual device.
 7. The storage system according to claim 6, wherein said plurality of logical devices are associated with a bitmap, and said bitmap indicates whether blocks within said plurality of logical devices have stored data or not, and wherein during said second migration, said bitmap is checked and copy is performed from said second logical device to said second virtual device using said bitmap.
 8. The storage system according to claim 7, wherein if said bitmap indicates that there is no stored data, copy is not executed against corresponding region of said second logical device to said second virtual device.
 9. The storage system according to claim 3, wherein when said second migration is performed, a bitmap is created for said first logical device.
 10. The storage system according to claim 3, wherein said plurality of storage devices are external to said storage system, wherein said plurality of storage devices are magnetic disks, and wherein said migrations are recommended or scheduled based on a usage level of each logical device or each virtual device.
 11. A method for controlling a storage system, comprising a first port coupled to a host, a second port coupled to a plurality of storage devices, a processor, and a memory, the method comprising: providing a plurality of thin-provisioned volumes to said host, wherein in response to a write operation, storage segments for said plurality of thin-provisioned volumes are allocated from a pool; managing a plurality of logical devices, of which segments are allocated to said plurality of storage devices and are associated with logical block addresses, wherein said plurality of logical devices present a logical storage area for a logical unit to store and present data to and from said host; and performing a first migration from a first thin-provisioned volume of said plurality of thin-provisioned volumes to a first logical device of said plurality of logical devices, wherein when performing said first migration, a pair is created between said first thin-provisioned volume and said first logical device, and data stored in said first thin-provisioned volume is copied to a portion of said first logical device.
 12. The method according to claim 11, wherein said plurality of logical devices are defined by an administrator, and mapping between said plurality of logical devices and parity groups is stored in said memory, and wherein in response to write operations, if a target of the write operation has not been allocated a storage segment, a storage segment is allocated from said pool before the write operation is performed.
 13. The method according to claim 12, wherein remainder of said portion of said first logical device not copied is filled by “0,” if the first logical device is not formatted.
 14. The method according to claim 11, wherein said programs process a second migration from a second logical device of said plurality of logical devices to a second thin-provisioned volume of said plurality of thin-provisioned volumes, and wherein when said programs process said second migration, a copy pair is created between said second logical device and said second thin-provisioned volume.
 15. The method according to claim 14, wherein said plurality of logical devices are associated with a bitmap, and said bitmap indicates whether blocks within said plurality of logical devices have stored data or not, and wherein during said second migration, said bitmap is checked and copy is performed from said second logical device to said second thin-provisioned volume using said bitmap.
 16. The method according to claim 15, wherein if said bitmap indicates that there is no stored data, copy is not executed against corresponding region of said second logical device to said second thin-provisioned volume.
 17. The method according to claim 11, wherein when said second migration is performed, a bitmap is created for said first logical device.
 18. The method according to claim 11, wherein said plurality of storage devices are magnetic disks, and wherein said migrations are recommended or scheduled based on a usage level of each logical device or each thin-provisioned volume. 